Translation and Dictionary

Comparison of Unicode encodings : English Wikipedia

This article compares Unicode encodings. Two situations are considered: 8-bit-clean environments, and environments that forbid the use of byte values that have the high bit set. Such prohibitions originally accommodated links that carried only seven data bits; they remain in the standards, so software must still generate messages that comply with them. Standard Compression Scheme for Unicode and Binary Ordered Compression for Unicode are excluded from the comparison tables because their sizes are difficult to quantify simply.
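As a rough illustration of the second situation, the check below tests whether an encoded message contains any byte with the high bit set. The helper name `is_seven_bit_safe` and the sample strings are assumptions made for this sketch and do not come from any standard.

```python
def is_seven_bit_safe(data: bytes) -> bool:
    """Return True if no byte in data has the high bit (0x80) set."""
    return all(byte < 0x80 for byte in data)


# Sample inputs chosen for illustration only.
ascii_only = "plain text".encode("utf-8")
with_accent = "caf\u00e9".encode("utf-8")  # U+00E9 takes two bytes in UTF-8

print(is_seven_bit_safe(ascii_only))   # True
print(is_seven_bit_safe(with_accent))  # False
```

A message that fails this check would need a further transfer encoding before it could pass through an environment limited to seven data bits.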
==Compatibility issues==
A UTF-8 file that contains only ASCII characters is identical to an ASCII file. Legacy programs can generally handle UTF-8 encoded files, even if they contain non-ASCII characters. For instance, the C printf function can print a UTF-8 string: it looks only for the ASCII '%' character to identify a format specifier and passes all other bytes through, so non-ASCII characters are output unchanged.
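The property that makes this work can be checked directly. The sketch below, using sample strings chosen purely for illustration, confirms that ASCII-only text yields the same bytes in UTF-8 and ASCII, and that every byte of a multi-byte UTF-8 sequence has the high bit set, so a byte-wise search for '%' (0x25) cannot match inside a non-ASCII character.

```python
# Sample text: ASCII characters, one '%', and two non-ASCII characters.
text = "50% caf\u00e9 \u2615"
utf8 = text.encode("utf-8")

# ASCII-only text produces identical bytes under both encodings.
same_as_ascii = "50% done".encode("utf-8") == "50% done".encode("ascii")

# Collect the bytes that belong to non-ASCII characters and confirm
# that each of them has the high bit set.
non_ascii_bytes = b"".join(ch.encode("utf-8") for ch in text if ord(ch) > 0x7F)
high_bit_only = all(b >= 0x80 for b in non_ascii_bytes)

# A byte-wise scan for '%' therefore finds only the real percent sign.
percent_positions = [i for i, b in enumerate(utf8) if b == 0x25]

print(same_as_ascii)      # True
print(high_bit_only)      # True
print(percent_positions)  # [2]
```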
UTF-16 and UTF-32 are incompatible with ASCII files, and thus require Unicode-aware programs to display, print and manipulate them, even if the file is known to contain only characters in the ASCII subset. Because text in these encodings contains many zero bytes, it cannot be manipulated by normal null-terminated string handling, even for simple operations such as copying.
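The effect of the embedded zero bytes can be sketched from Python with the standard `ctypes` module, whose string buffers expose a `value` attribute that reads the contents as a NUL-terminated C string. The helper name `c_string_view` and the sample text are assumptions for this illustration.

```python
import ctypes


def c_string_view(data: bytes) -> bytes:
    """Return data as a NUL-terminated C string routine would see it."""
    # .value stops at the first zero byte, like strlen or strcpy would.
    return ctypes.create_string_buffer(data).value


text = "Hi"
utf8 = text.encode("utf-8")       # 2 bytes, no zero bytes
utf16 = text.encode("utf-16-le")  # 4 bytes, every second byte is zero
utf32 = text.encode("utf-32-le")  # 8 bytes, three zero bytes per character

print(utf16.count(b"\x00"), utf32.count(b"\x00"))  # 2 6
print(c_string_view(utf8))   # b'Hi'
print(c_string_view(utf16))  # b'H'
print(c_string_view(utf32))  # b'H'
```

Only the UTF-8 bytes survive intact; the UTF-16 and UTF-32 data are cut off after the first character.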
Therefore, even on systems that mostly use UTF-16, such as Windows and Java, UTF-16 text files are not common. Either older 8-bit encodings such as ASCII or ISO-8859-1 are still used for text files, which gives up support for the full range of Unicode characters, or UTF-8 is used, which supports them all. One of the few counterexamples, a file format that does use UTF-16, is the "strings" file used by Mac OS X (10.3 and later) applications to look up internationalized versions of messages. These files default to UTF-16, and "files encoded using UTF-8 are not guaranteed to work. When in doubt, encode the file using UTF-16".〔(Apple Developer Connection: Internationalization Programming Topics: Strings Files)〕
XML is, by default, encoded as UTF-8, and all XML processors must at least support UTF-8 (including US-ASCII by definition) and UTF-16.
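A minimal sketch of this behavior, assuming Python's standard `xml.etree.ElementTree` parser as the XML processor and a made-up `greeting` element: a document with no encoding declaration is read as UTF-8, and a UTF-16 document that starts with a byte order mark is also accepted.

```python
import xml.etree.ElementTree as ET

# No encoding declaration and no byte order mark: treated as UTF-8.
utf8_doc = "<greeting>caf\u00e9</greeting>".encode("utf-8")
utf8_text = ET.fromstring(utf8_doc).text

# UTF-16 document; Python's "utf-16" codec prepends a byte order mark.
utf16_source = '<?xml version="1.0" encoding="UTF-16"?><greeting>caf\u00e9</greeting>'
utf16_doc = utf16_source.encode("utf-16")
utf16_text = ET.fromstring(utf16_doc).text

# Both documents decode to the same text.
print(utf8_text == utf16_text == "caf\u00e9")
```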

Excerpt source: Wikipedia, the free encyclopedia




Copyright(C) kotoba.ne.jp 1997-2016. All Rights Reserved.